AP® STATISTICS 
2005 SCORING COMMENTARY (Form B) 


Question 1 


Sample: 1A 
Score: 4 


This is a complete response using clear prose and a visual display. Although some extraneous information about 
the spread and median of the distribution is given, the response to part (a) clearly indicates the distribution is 
slightly skewed to the left. The median is correctly chosen as the larger value in part (b), and this choice is 
supported with rationale that the mean is pulled toward the more extreme observations in the left tail of the 
distribution. The display of a left-skewed density function with labels for the mean and median contributes to the 
clarity of the communication. The midrange is correctly evaluated in the response to part (c). Identification of the 
midrange as a measure of center is supported by the rationale that the midrange is a central value in the sense that 
it is halfway between the minimum and maximum exam scores. 
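The two ideas in this commentary, that the left tail pulls the mean below the median and that the midrange is a center while the range is a spread, can be checked numerically. The sketch below is only an illustration: the `scores` list is hypothetical and is not the data from the exam question.

```python
from statistics import mean, median

# Hypothetical left-skewed exam scores: most are high, a few trail off low.
scores = [62, 68, 75, 81, 85, 88, 90, 91, 93, 94, 95, 97]

# Midrange: the point halfway between the extremes (a measure of center).
midrange = (min(scores) + max(scores)) / 2
# Range: the distance between the extremes (a measure of spread).
value_range = max(scores) - min(scores)

print("mean    ", round(mean(scores), 2))
print("median  ", median(scores))
print("midrange", midrange)
print("range   ", value_range)
```

Both the midrange and the range are built from the minimum and maximum, but one averages them and the other subtracts them, which is why only the first locates a central value.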


Sample: 1B 
Score: 3 


This is a substantial response that is clearly communicated. The response to part (a) correctly indicates that the 
distribution is skewed to the left. The median is correctly chosen as the larger value in part (b) with justification 
that the mean will be pulled down toward the low values because the distribution is left-skewed. In part (c) the 
midrange is correctly evaluated, but the midrange is incorrectly identified as a measure of spread. The student 
appears to have confused a measure of center with the center of the distribution. The student correctly points out 
that the midrange may not be the mean of a skewed distribution but incorrectly concludes that this disqualifies 
the midrange as a measure of center. Having ruled that out, the student is left to identify the midrange, incorrectly, as a 
measure of spread. But the mean is not the only measure of center. The median is also a measure of center, for 
example, and it is not equal to the mean for skewed distributions. 


Sample: 1C 
Score: 1 


This response correctly identifies the left skewness of the distribution, but it exhibits confusion about the effect 
of skewness on the relative values of the mean and median. It also fails to distinguish between the different ways 
in which the minimum and maximum are used in the construction of the midrange and the range. The response to 
part (a) correctly indicates that the distribution is skewed toward the low test scores and provides some additional 
information about the mode. The mean is incorrectly chosen as the larger value in part (b). The student 
incorrectly believes that the mean will be larger than the median because test scores in the upper 80s and 90s 
occur more frequently than do test scores in any category below the upper 80s. The relatively large distances of 
the 60s categories from the median are not appreciated. The midrange is correctly evaluated in part (c), but this 
student considers it a measure of spread because it is evaluated from the maximum and minimum values. This 
student fails to properly distinguish the midrange from the range. 
Question 2 


Sample: 2A 
Score: 4 


This is a complete response that also displays excellent organization and communication. The mean and standard 
deviation for the distribution of the number of child tickets purchased by individual customers are correctly 
evaluated in the response to part (a). Inserting appropriate numbers into mean and variance formulas provides 
sufficient supporting work. The response clearly shows that the standard deviation is computed as the square root 
of the variance. In part (b) the student clearly indicates how the mean and variance for the sum are evaluated 
without introducing new notation. The operation of taking the square root of the variance to obtain the standard 
deviation of the sum is also clearly communicated. The response to part (c) provides an especially nice 
description of the evaluation of the mean and standard deviation of the purchase amounts for child and adult 
tickets before moving on to the evaluation of the mean and standard deviation of the total. The calculations are 
correct and nicely labeled. 
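The chain of calculations described for parts (a) through (c) can be sketched as follows. The distribution, the adult-ticket mean and standard deviation, and the ticket prices are all values assumed for illustration; the commentary itself does not state them.

```python
from math import sqrt, isclose

# Assumed probability distribution for C, the number of child tickets.
child_dist = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

# Part (a): mean and variance from the definitions; sd is the root of the variance.
mean_c = sum(c * p for c, p in child_dist.items())
var_c = sum((c - mean_c) ** 2 * p for c, p in child_dist.items())
sd_c = sqrt(var_c)

# Assumed mean and standard deviation for A, the number of adult tickets.
mean_a, sd_a = 2.0, 1.2

# Part (b): for independent C and A, means add and variances add.
mean_total = mean_c + mean_a
sd_total = sqrt(var_c + sd_a ** 2)

# Part (c): assumed prices; each price is squared when it scales a variance.
price_child, price_adult = 15, 25
mean_spent = price_child * mean_c + price_adult * mean_a
sd_spent = sqrt(price_child ** 2 * var_c + price_adult ** 2 * sd_a ** 2)

print(mean_c, sd_c, mean_total, round(sd_total, 3), mean_spent, round(sd_spent, 2))
```

In every step the standard deviation is obtained last, by taking the square root of a variance that has already been combined.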


Sample: 2B 
Score: 3 


This is a substantial response that is clearly communicated. The mean of the number of child tickets purchased by 
individual customers is correctly evaluated in the response to part (a) and supported with appropriate numerical 
values inserted. The formula used for the computation of the standard deviation is incorrect, although this student 
realizes that the standard deviation is the square root of the variance. Displaying appropriate formulas and 
inserting appropriate values provide sufficient supporting work. This student introduces notation μ_C and σ_C for 
the mean and standard deviation of C. This is standard notation, and it is defined by formulas. There is some 
notational inconsistency in the formulas. In both formulas x_i is introduced for values of the random variable C, 
and another symbol is used instead of μ_C in the formula for the variance. This poor communication was overlooked in scoring 
the response because the intent of the student is clear from the display of the formulas with the numerical values 
inserted. The student clearly shows that the standard deviation is computed as the square root of the variance. In 
part (b) the response clearly indicates how the mean and variance for the sum are evaluated. New μ and σ notation 
is introduced for the mean and standard deviation of the number of adult tickets. The student should have 
defined this new notation, but it was accepted in the scoring of this response because it is standard notation and 
consistent with the notation introduced in part (a). The mean is correctly computed in the response to part (c), but 
this student fails to square the ticket costs in computing the variance of the total amount spent in part (c). 


Sample: 2C 
Score: 1 


The student is able to correctly work with means in some circumstances but has difficulty working with variances 
and standard deviations. The mean of the number of child tickets purchased by individual customers is correctly 
evaluated in the response to part (a) and supported with a formula showing correct numerical values. The formula 
used for the computation of the standard deviation is incorrect, although this student realizes that the standard 
deviation is the square root of the variance. In part (b) the calculations of the mean and standard deviation for the 
total number of child and adult tickets purchased are both incorrect. The calculation of the standard deviation 
uses the correct standard deviation for the number of adult tickets and the incorrect value for the standard 
deviation of the number of child tickets computed in the response to part (a). Using an incorrect value computed 
in a previous part was not scored as an error. The student understands that the variance of a sum of two 
independent random variables is the sum of the individual variances but incorrectly weights the variances in the 
computation. The mean of the total amount spent on purchases is correctly computed in the response to part (c). 
The standard deviation is incorrectly computed as the sum of the standard deviations for the individual total 
purchase amounts for child and adult tickets. 
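To see how far the two variance errors described for Question 2 move the answer, the sketch below compares the correct standard deviation of a total purchase amount with the result of not squaring the prices and with the result of adding standard deviations directly. All numbers are hypothetical and chosen only to make the contrast visible.

```python
from math import sqrt

# Hypothetical inputs: standard deviations of two independent counts and unit prices.
sd_child, sd_adult = 1.0, 1.2
price_child, price_adult = 15, 25

# Correct: scale each variance by the squared price, add, then take the root.
sd_correct = sqrt(price_child**2 * sd_child**2 + price_adult**2 * sd_adult**2)

# Error of the kind described for 2B: prices not squared when scaling variances.
sd_unsquared = sqrt(price_child * sd_child**2 + price_adult * sd_adult**2)

# Error of the kind described for 2C: standard deviations of the amounts added.
sd_added = price_child * sd_child + price_adult * sd_adult

print(round(sd_correct, 2), round(sd_unsquared, 2), round(sd_added, 2))
```

Adding standard deviations always overstates the spread of a sum of independent amounts, because the square root of a sum of squares is smaller than the sum of the terms.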
Question 3 


Sample: 3A 
Score: 4 


This response describes completely randomized and matched-pairs designs and identifies appropriate inference 
procedures. To randomly assign participants to treatments for the completely randomized design in part (a), this 
student first labels the 100 subjects with three-digit numbers from 001 through 100. The first 50 numbers 
randomly generated with a calculator identify the 50 participants who receive the new compound, and the other 
50 participants receive the current compound. This could have been more clearly stated. The response also 
indicates that a two-sample t-test should be used to compare the mean number of mosquito bites for the new and 
current compounds. The student states the null and alternative hypotheses and correctly suggests a one-sided 
alternative. In the response to part (b) the student describes an experiment in which each participant uses both 
compounds. The order in which the compounds are tested is randomized by a coin toss for each participant. The 
student goes on to suggest a matched-pairs t-test and specifies the correct null and alternative hypotheses for the 
mean difference. The one-sided alternative hypothesis is appropriate because the scientists want to show that the 
new compound is more effective than the current compound. Part (c) correctly selects the matched-pairs design in 
part (b) as the better design based on a rationale that alludes to subject-to-subject variability in susceptibility to 
mosquito bites. Although several points could have been more clearly stated, this is an essentially complete 
response. 
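The two randomization schemes described here can be sketched in a few lines. This is a minimal illustration, not the student's procedure: it uses Python's pseudorandom generator in place of a calculator or coin, and the seed and variable names are arbitrary choices.

```python
import random

rng = random.Random(2005)  # fixed seed so the sketch is reproducible

# Completely randomized design: label the 100 participants 1..100 and
# draw 50 labels without replacement to receive the new compound.
participants = list(range(1, 101))
new_group = set(rng.sample(participants, 50))
current_group = set(participants) - new_group

# Matched-pairs design: every participant uses both compounds, and a
# simulated coin toss per participant decides which compound is tested first.
first_compound = {p: rng.choice(["new", "current"]) for p in participants}

print(sorted(new_group)[:5], first_compound[1])
```

Sampling without replacement avoids the need to skip repeated labels, which is the step a random number table or calculator procedure has to spell out explicitly.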


Sample: 3B 
Score: 3 


This response effectively communicates knowledge of completely randomized and matched-pairs designs. It 
provides a correct inference procedure for the completely randomized design but fails to identify a test statistic 
for the matched-pairs design. Randomization schemes are completely described. To randomly assign participants 
to treatments for the completely randomized design in part (a), the student labels the 100 subjects with two-digit 
numbers and explains how to use a random number table to select 50 participants for each treatment group by 
moving across the rows of a random number table until 50 unique two-digit numbers are found. The 50 
participants with those 50 labels receive the new compound, and the other 50 participants receive the current 
compound. This response also clearly indicates that mosquito bites are recorded for the participants, and mean 
numbers of mosquito bites for the new and current compounds will be compared with a two-sample t-test. 
Statements of the null and alternative hypotheses are provided. Since the scientists want to show that the new 
compound is more effective than the current compound in preventing mosquito bites, the one-sided alternative 
hypothesis is appropriate. In part (b) this response makes use of the randomization procedure described in part (a) 
to randomly select 50 subjects who will use the new compound on their right arm and the current compound on 
their left arm. The other 50 subjects receive the other allocation of the two treatments to their two arms. The 
student goes on to indicate that the difference in numbers of mosquito bites for the two compounds will be 
computed for each participant and discusses testing a hypothesis but fails to identify a test statistic. This response 
also does not indicate how the 100 participants will use the 100 bins of mosquitoes. Part (c) correctly selects the 
matched-pairs design in part (b) as the better design and provides a discussion that indicates that this student is 
aware that the proposed matched-pairs design will control subject-to-subject variability in susceptibility to 
mosquito bites. 


Sample: 3C 
Score: 1 


This response indicates some knowledge of completely randomized and matched-pairs designs, but the 
description of randomization procedures is incomplete and there is no mention of inference procedures. Part (a) 
indicates that subjects should be randomly assigned to bins and randomly assigned to treatment groups but does 
not provide any description of the randomization procedures. This response does not mention a two-sample t-test 
to compare mean numbers of bites for the two compounds. It is not sufficient to simply indicate that results 
should be compared. Part (b) only indicates that subjects should be randomly assigned to bins. The student 
addresses neither random assignment of the two treatments to the participant’s two arms nor randomizing the 
order in which the arms are tested. The student also fails to mention an inference procedure. Part (c) correctly 
selects the matched-pairs design in part (b) as the better design and alludes to potential variability in bins of 
mosquitoes as well as person-to-person variability in susceptibility to mosquito bites. 
Question 4 


Sample: 4A 
Score: 4 


This is a brief but complete response that is clearly communicated. In part (a) the student identifies the 
appropriate confidence interval by correctly stating the formula. The student specifies a confidence level and 
states the correct degrees of freedom. Numbers are correctly substituted into the formula, and the correct interval 
is computed. The student correctly interprets the interval, although the linkage to the context of the problem is 
minimally sufficient. In part (b) the student identifies a correct pair of hypotheses and uses the confidence 
interval to state a correct conclusion in the context of the problem. The linkage to the context of the problem is 
more complete in part (b) than in (a). 


Sample: 4B 
Score: 3 


This is a substantial response that is clearly communicated, but there is an error in the mechanics. In part (a) the 
student identifies the appropriate confidence interval by substituting numbers into the formula. The student 
incorrectly believes that n is 24, rather than 12. This is reflected in the denominator of the square root in the 
standard error and in the degrees of freedom used to find the t critical value. The student drops the negative sign 
on the reported mean of the differences but correctly states the direction of the difference in the interpretation of 
the interval. The interpretation of the interval is correct and includes a nice linkage to the context of the problem. 
In part (b) the student identifies a correct pair of hypotheses and uses the confidence interval to state a correct 
conclusion. Although the conclusion itself does not refer to the context of the problem, the context is provided 
above in the statement of the hypotheses. 


Sample: 4C 
Score: 2 


This is a developing response with an error in mechanics and a weak interpretation of the confidence interval. In 
part (a) the student identifies the appropriate confidence interval by correctly stating the formula and specifies a 
confidence level. Despite mentioning t in the formula, the student uses the z-value of 1.96 when computing the 
interval. The student correctly interprets the interval but without the required linkage to the context of the 
problem. In part (b) the student uses the confidence interval to state a correct conclusion, although the linkage to 
the context of the problem is minimally sufficient. 
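The mechanical errors noted for Question 4 both shrink the interval. The sketch below compares the half-width of a paired-difference interval computed three ways: with n = 12 and the t critical value for 11 degrees of freedom, with n mistaken for 24, and with the normal value 1.96. The summary statistics are hypothetical, and a 95 percent confidence level is assumed throughout; the critical values are from standard tables.

```python
from math import sqrt

# Hypothetical summary statistics for 12 paired differences.
n_pairs = 12
mean_diff = -1.5
sd_diff = 2.0

def half_width(critical_value, n):
    """Half-width of a one-sample interval for a mean difference."""
    return critical_value * sd_diff / sqrt(n)

# 95% critical values from standard tables.
T_DF11 = 2.201   # t*, 11 degrees of freedom (n = 12 pairs)
T_DF23 = 2.069   # t*, 23 degrees of freedom (what n = 24 would imply)
Z_95 = 1.960     # normal critical value

correct = half_width(T_DF11, n_pairs)
treating_n_as_24 = half_width(T_DF23, 24)
using_z = half_width(Z_95, n_pairs)

interval = (mean_diff - correct, mean_diff + correct)
print(round(correct, 3), round(treating_n_as_24, 3), round(using_z, 3), interval)
```

With paired data the sample size is the number of pairs, so both the standard error and the degrees of freedom are based on 12, not on the 24 individual measurements.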
Question 5 


Sample: 5A 
Score: 4 


In part (a) the correct formula for the estimated regression line is reported, and the variables are clearly defined. 
The student clearly realizes the estimated regression line provides estimates of pulse rate for various walking 
speeds. In part (b) the student clearly indicates that John’s pulse rate would be expected to be close to the 
estimated intercept (63.457 bpm) when his walking speed is zero. This conveys the notion of the estimated 
intercept as an estimate of John’s mean pulse rate when he is not walking. Both the estimated intercept and the 
estimated slope are interpreted in the context of the problem using appropriate units of measurement. The margin 
of error is correctly evaluated in the response to part (c). The student clearly shows that the t-value is based on 5 
degrees of freedom. 


Sample: 5B 
Score: 3 


The response to part (a) does not report the estimated regression line in the context of the problem, nor does it 
define the X and Y variables used in the formula. The response to part (b) provides interpretations of both the 
estimated intercept and the estimated slope in the context of the problem. Appropriate units of measurement are 
used in the interpretation of the slope, but bpm is omitted from the interpretation of the intercept. The 
interpretation of the slope uses “increases on average by” to indicate that the slope is an average rate of increase 
and “heart rate is around” to indicate that the intercept is a prediction of John’s resting heart rate. The 
communication of these concepts could have been better. The margin of error is correctly evaluated in the 
response to part (c) and the supporting work is shown. 


Sample: 5C 
Score: 2 


In part (a) the estimated regression line is correctly reported in terms of the speed and pulse rate variables. The 
response to part (b) is incomplete. It provides correct algebraic interpretations of the intercept and slope of a line 
in the context of the problem, but it fails to convey the idea that the values reported in the stem of the question 
are estimates of means. This response is also incomplete because it fails to provide numerical values and units of 
measurement. A 98 percent critical value from the normal distribution is used to compute the margin of error in 
part (c) instead of the appropriate t-value. In this case the sample size, or degrees of freedom, is too small for the 
normal distribution to provide an accurate approximation and the resulting margin of error is too small. 
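The size of the understatement can be illustrated by comparing the two critical values at 98 percent confidence. In the sketch below the standard error is a hypothetical placeholder, and the t critical value for 5 degrees of freedom is taken from a standard t table rather than computed.

```python
from statistics import NormalDist

confidence = 0.98
se_estimate = 1.0  # hypothetical standard error of the estimated coefficient

# t critical value for 98% confidence with 5 degrees of freedom (standard t table).
t_star = 3.365
# Normal critical value for the same confidence level.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin_t = t_star * se_estimate
margin_z = z_star * se_estimate

print(round(z_star, 3), margin_t, round(margin_z, 3))
```

With only 5 degrees of freedom the t critical value is roughly 45 percent larger than the normal one, so a margin of error based on the normal distribution is noticeably too narrow.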
Question 6 


Sample: 6A 
Score: 4 


This is a complete response that is clearly communicated and the work is thorough. The response to part (a) 
correctly states the set of hypotheses. The test of inference is named as a t-test for a single sample mean, the 
appropriate values are substituted into the correct formula, and the correct p-value is reported. The correct 
conclusion is given in context of the problem and the linkage to the p-value is clear. In part (b) the response 
documents the calculation of the appropriate z-score (-1.5) and indicates that the required probability is 
P(Z ≥ -1.5) = 0.9332. In part (c) the student correctly points out that the random variable in this question 
follows a binomial distribution and shows how to correctly substitute values into the binomial probability 
formula to obtain the required probability (0.4362). In part (d) the student correctly estimates the required 
probability by dividing the number of simulated samples with a minimum of at least 125 fluid ounces by the total 
number of simulated samples. Finally, this estimate is compared to the theoretical result in the 
response to part (c), and it is noted that since the sample size was fairly large the estimate should be close to the 
theoretical probability. 
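The three probabilities in parts (b) through (d) are linked, and the link can be sketched numerically. The code below assumes a sample size of 12 (inferred from the 11 degrees of freedom mentioned for this question) and assumes that the part (c) probability is the chance that all 12 observations meet the threshold; under those assumptions the reported values 0.9332 and 0.4362 are reproduced. The simulation works on the z scale, so no mean or standard deviation needs to be assumed.

```python
import random
from math import comb
from statistics import NormalDist

z = -1.5                             # z-score reported in part (b)
p_single = 1 - NormalDist().cdf(z)   # P(Z >= -1.5)

n = 12                               # assumed sample size (11 degrees of freedom + 1)
# Binomial probability that all n observations meet the threshold.
p_all = comb(n, n) * p_single**n * (1 - p_single)**0

# Simulation on the z scale: fraction of samples whose minimum is >= z.
rng = random.Random(6)               # arbitrary seed for reproducibility
n_sims = 20_000
hits = sum(
    min(rng.gauss(0, 1) for _ in range(n)) >= z
    for _ in range(n_sims)
)
p_simulated = hits / n_sims

print(round(p_single, 4), round(p_all, 4), p_simulated)
```

The simulated proportion differs from the theoretical value only by sampling variability, which shrinks as the number of simulated samples grows.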


Sample: 6B 
Score: 3 


This is a substantial response that is clearly communicated, but one in which there are a few errors in the test of 
hypothesis. The response to part (a) incorrectly specifies a two-tailed alternative hypothesis. The name of the test 
and the t-score are stated correctly, but the p-value corresponds to a test against a one-sided alternative 
hypothesis, which does not coincide with the student’s original two-sided alternative hypothesis. In addition, the 
p-value corresponds to a z-test instead of a t-test with 11 degrees of freedom. The statement “We must keep H_0” 
is not desirable, but the conclusion is then interpreted correctly, in context, and with a 5 percent level of 
significance. The probabilities are correctly computed, with work shown, for parts (b) and (c). The response to 
part (d) correctly uses the simulated results to estimate the required probability and makes a thoughtful 
comparison with the theoretical probability from part (c). The student attributes the small difference to sampling 
variability. 


Sample: 6C 
Score: 1 


The response in part (a) correctly provides the null and alternative hypotheses, the name of the test, and the 
correct formula with the numbers substituted. However, the t-score is computed incorrectly. The p-value does 
correspond to the student’s incorrect t-score. Using the student’s p-value, the conclusion is correct but it is not 
presented in the context of the problem. The response in part (b) indicates that the probability desired is 
P(X ≥ 125) when in fact P(X < 125) = 0.0668 is computed. In part (c) the student incorrectly tries to apply a 
normal probability distribution formula and gets this part completely incorrect. In part (d) the student uses the 
simulation to find P(X < 125), obtaining 0.56. The student does compare this value to the theoretical probability 
that was calculated in part (c). 